A Clustering Approach for Achieving Data Privacy

Authors

Alina Campan

Traian Marius Truta

John Miller

Raluca Sinca

Abstract

New privacy regulations, together with ever-increasing data availability and computational power, have created a huge interest in data privacy research. One major research direction is built around the k-anonymity property and its extensions, which are required of released data. In this paper we present such an extension to k-anonymity, called p-sensitive k-anonymity, which addresses some of the weaknesses that the k-anonymity model has been shown to have. We also introduce a new algorithm, based on a greedy clustering approach, for enforcing p-sensitive k-anonymity on microdata sets. To limit the amount of information loss, the proposed algorithm uses cell-level generalization for categorical attributes and hierarchy-free generalization for numerical attributes. We believe that this algorithm can be adjusted and used to enforce other similar privacy models as well, with better results than the algorithms originally proposed along with those models. Our experiments show that the proposed algorithm efficiently generates masked microdata with the p-sensitive k-anonymity property.
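The abstract names the p-sensitive k-anonymity property without spelling it out. The sketch below checks it under the commonly used definition, which is an assumption here since the abstract does not state it: every group of records sharing the same quasi-identifier values has at least k records, and each sensitive attribute takes at least p distinct values within that group. The function name, the attribute names, and the sample records are hypothetical, and the code is only a property check, not the paper's greedy clustering algorithm.

```python
from collections import defaultdict


def is_p_sensitive_k_anonymous(records, quasi_identifiers, sensitive_attrs, k, p):
    """Check the (assumed) p-sensitive k-anonymity property on masked records.

    records: list of dicts, one per masked microdata record.
    quasi_identifiers / sensitive_attrs: lists of attribute names (hypothetical).
    """
    # Group records by their combination of quasi-identifier values.
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[attr] for attr in quasi_identifiers)
        groups[key].append(rec)

    for group in groups.values():
        # k-anonymity: each group must contain at least k records.
        if len(group) < k:
            return False
        # p-sensitivity: each sensitive attribute needs at least p distinct
        # values inside the group.
        for attr in sensitive_attrs:
            if len({rec[attr] for rec in group}) < p:
                return False
    return True


# Illustrative masked data (made up for this example).
masked = [
    {"age": "20-30", "zip": "410**", "illness": "flu"},
    {"age": "20-30", "zip": "410**", "illness": "asthma"},
    {"age": "30-40", "zip": "482**", "illness": "diabetes"},
    {"age": "30-40", "zip": "482**", "illness": "diabetes"},
]

print(is_p_sensitive_k_anonymous(masked, ["age", "zip"], ["illness"], k=2, p=1))
print(is_p_sensitive_k_anonymous(masked, ["age", "zip"], ["illness"], k=2, p=2))
```

In the sample data both groups have two records, so plain 2-anonymity holds, but the second group carries a single illness value. That is the kind of weakness of k-anonymity the extension targets: an attacker who locates someone in that group learns the sensitive value even without identifying the exact record.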

Similar resources

Entropy-based Consensus for Distributed Data Clustering

The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...

A Hybrid Privacy Preserving Approach in Data Mining

Data mining algorithms extract unknown, interesting patterns from large collections of data. Some clandestine or secret information may be exposed as part of the data mining process. In this paper we put forward a hybrid approach for achieving privacy during the mining procedure. The first step is to sanitize the original data using a geometrical data transformation. In the second stage ...

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One well-researched constrained clustering technique is microaggregation. In a microaggreg...

Privacy Preserving Distributed K-Means Clustering in Malicious Model Using Zero Knowledge Proof

Preserving privacy is crucial in distributed environments in which data mining becomes a collaborative task among participants. Critical applications in distributed environments demand a higher level of privacy with lower overhead. Cryptography-based solutions provide a higher level of privacy but scale poorly because of their higher overhead. Further, existing cryptography based solu...

A centralized privacy-preserving framework for online social networks

There are some critical privacy concerns in current online social networks (OSNs). Users' information is disclosed to entities that were not supposed to access it. Furthermore, the notion of friendship is inadequate in OSNs, since the degree of social relationship between users changes dynamically over time. Additionally, users may define similar privacy settings for their f...

Publication date: 2007
